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Computer programs are portable to the extent that they can be moved 
to new computing environments with much less effort than it would take 
to rewrite them. In the limit, a program is perfectly portable if it can be 
moved at will with no change whatsoever. Recent C language extensions 
have made it easier to write portable programs. Some tools have also 
been developed that aid in the detection of nonportable constructions. 
With these tools many programs have been moved from the PDP-11 on 
which they were developed to other machines. In particular, the UNIX* 
operating system and most of its software have been transported to the 
Interdata 8/32. The source-language representation of most of the code 
involved is identical in all environments. 


I. INTRODUCTION 

A program is portable to the extent that it can be easily moved to 
a new computing environment with much less effort than would be 
required to write it afresh. It may not be immediately obvious that 
lack of portability is, or needs to be, a problem. Of course, practi- 
cally no assembly-language programs are portable. The fact is, how- 
ever, that most programs, even in high-level languages, depend 
explicitly or implicitly on assumptions about such machine- 

* UNIX is a trademark of Bell Laboratories. 


2021 



<r 

dependent features as word and character sizes, character set, file 
system structure and organization, peripheral device handling, and 
many others. Moreover, few computer languages are understood by 
more than a handful of kinds of machines, and those that are (for 
example, Fortran and Cobol) tend to be rather limited in their 
scope, and, despite strong standards efforts, still differ considerably 
from one machine to another. 

The economic advantages of portability are very great. In many 
segments of the computer industry, the dominant cost is develop- 
ment and maintenance of software. Any large organization, cer- 
tainly including the Bell System, will have a variety of computers 
and will want to run the same program at many locations. If the 
program must be rewritten for each machine and maintained for 
each, software costs must increase. Moreover, the most .effective 
hardware for a given job is not constant as time passes. If a non- 
portable program remains tied to obsolete hardware to avoid the 
expense of moving it, the costs are equally real even if less obvious. 
Finally, there can be considerable benefit in using machines from 
several manufacturers simply to avoid being utterly dependent on a 
single supplier. 

Most large computer systems spend most of their time executing 
application programs; circuit design and analysis, network routing, 
simulation, data base applications, and text processing are particu- 
larly important at Bell Laboratories. For years, application programs 
have been written in high-level languages, but the programs that 
provide the basic software environment of computers (for example, 
operating systems, compilers, text editors, etc.) arc still usually 
coded in assembly language. When the costs of hardware were large 
relative to the costs of software, there was perhaps some justification 
for this approach; perhaps an equally important reason was the lack 
of appropriate, adequately supported languages. Today hardware is 
relatively cheap, software is expensive, and any number of 
languages arc capable of expressing well the algorithms required for 
basic system software. It is a mystery why the vast majority of com- 
puter manufacturers continue to generate so much assembly- 
language software. 

The benefits of writing software in a well-designed language far 
exceed the costs. Aside from potential portability, these benefits 
include much smaller development and maintenance costs. It is true 
that a penalty must be paid for using a high-level language, particu- 
larly in memory space occupied. The cost in time can usually be 
controlled: experience shows that the time-critical part of most 
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also into deeper questions of the model of floating point computa- 
tions on the various target machines. 

In parallel with this work, the development at Bell Laboratories of 
SnoboH 7 marks one of the first attempts at making a significant 
compiler portable. Snobol4 was successfully moved to a large 
number of machines, and, while the implementation was sometimes 
inefficient, the techniques made the language widely available and 
stimulated additional work leading to more efficient implementa- 
tions. 

III. PORTABILITY OF C PROGRAMS - INITIAL EXPERIENCES 

C was developed for the PDP-11 on the UNIX system in 1972. Por- 
tability was not an explicit goal in its design, even though limitations 
in the underlying machine model assumed by the predecessors of C 
made us well aware that not all machines were the same. 2 Less than 
a year later, C was also running on the Honeywell 6000 system at 
Murray Hill. Shortly thereafter, it was made available on the IBM 
370 series machines as well. The compiler for the Honeywell was a 
new product, 8 but the IBM compiler was adapted from the PDP-11 
version, as were compilers for several other machines. 

As soon as C compilers were available on other machines, a 
number of programs, some of them quite substantial, were moved 
from UNIX to the new environments. In general, we were quite 
pleased with the ease with which programs could be transferred 
between machines. Still, a number of problem areas were evident. 
To begin with, the C language was growing and developing as 
experience suggested new and desirable features. It proved to be 
quite painful lo keep the various C compilers compatible; the 
Honeywell version was entirely distinct from the PDP-11 version, 
and the IBM version had been adapted, with many changes, from a 
by-then obsolete version of the PDP-11 compiler. Most seriously, 
the operating system interface caused far more trouble for portability 
than the actual hardware or language differences themselves. Many 
of the UNIX primitives were impossible to imitate on other operating 
systems; moreover, some conventions on these other operating sys- 
tems (for example, strange file formats and record-oriented I/O) 
were difficult to deal with while retaining compatibility with UNIX. 
Conversely, the I/O library commonly used sometimes made UNIX 
conventions excessively visible— for example, the number 518 often 
found its way into user programs as the size, in bytes, of a particu- 
larly efficient I/O buffer structure. 


Addihonal probiems in the compilers arose from the decision to 
use the local assemblers, loaders, and library editors on the host 
operating systems. Surprisingly often, they were unable to handle 
he code most naturally produced by the C compilers. For example 
the sernant.es of possibly initialized external variables in C was quke 
consciously designed to be implementable in a way identical to 
For ran s common blocks to guarantee its portability. It was an 
unpleasant surprise to discover that the Honeywell assembler would 

ffie^Tl i'f UCh b 'r CkS ^ h6nCe 6Xternal Variables > and ‘hat 

inL u t l "? k " ed ' tor P refer red to start external variables on even 
4096-byte boundaries. Software limitations in the target systems 
complicated the compilers and, in one case, the problems with 
external variables just mentioned, forced changes in the C language 

IV. THE UNIX PORTABILITY PROJECT 
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Our project had three major goals: 

(/) T° wr j; e a eomP'ier for C that could be changed without grave 
difficulty to generate code for a variety of machines. 

To refine and extend the C language to make most C pro- 
grams portable to a wide variety of machines, mechanically 
identifying non-portable constructions where possible. 

o revise or recode a substantial portion of UNIX in portable 
7 detecting and isolating machine dependencies, and demon- 
strate its portability by moving it to another machine. 

By pursuing each goal, we hoped to attain a corresponding benefit: 

(/) A C compiler adaptable to other machines (independently of 
UNixl, that puts into practice some recent developments ’in 
the theory of code generation. 


(//) 


(///) 
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(/'/') Improved understanding of the proper design of languages 
that, like C, operate on a level close to that of real machines 
but that can be made largely machine-independent. 

(/■(/) A relatively complete and usable implementation of UNIX on 
at least one other machine, with the hope that subsequent 
implementations would be fairly straightforward. 

We selected the Interdata 8/32 computer to serve as the initial 
target for the system portability research. It is a 32-bit computer 
whose design resembles that of the IBM System/360 and /370 series 
machines, although its addressing structure is rather different; in 
particular, it is possible to address any byte in virtual memory 
without use of a base register. For the portability research, of 
course, its major feature is that it is not a t’Di’-ll. In the longer 
term, we expect to find it especially useful for solving problems, 
often drawn from numerical analysis, that cannot be handled on the 
PDP-11 because of its limited address space. 

Two portability projects besides those referred to above are partic- 
ularly interesting. In the period 1976-1977, T. L. Lyon and his asso- 
ciates at Princeton adapted the UNIX kernel to run in a virtual- 
machine partition under vm/370 on an IBM System/370. 14 Enough 
software was also moved to demonstrate the feasibility of the effort, 
though no attempt was made to produce a complete, working sys- 
tem. In the midst of our own work on the Interdata 8/32, we 
learned that a UNIX portability project, for the similar Interdata 7/32, 
was under way at the University of Wollongong in Australia. 15 Since 
everything we know of this effort was discovered in discussion with 
its major participant, Richard Miller, 16 we will remark only that the 
transportation route chosen was markedly different from ours. In 
particular, an Interdata C compiler was adapted from the PDP-11 
compiler, and was moved as soon as possible to the Interdata, where 
it ran under the manufacturer’s operating system. Then the UNIX 
kernel was moved in pieces, first running with dummy device 
drivers as a task under the Interdata system, and only at the later 
stages independently. This approach, the success of which must be 
scored as a real tour de force, was made necessary by the 100 kilome- 
ters separating the PDP-11 in Sydney from the Interdata in Wol- 
longong. 

4.1 Project chronology 

Work began in the early months of 1977 on the compiler, assem- 
bler, and loader for the Interdata machine. Soon after its delivery at 
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sets already existing on the machine to which it is moved; to prom- 
ise to do so would impossibly complicate the project and, in fact, 
might destroy the usefulness of the result. If UNIX is to be installed 
on a machine, its way of doing business must be accepted as the 
right way; afterwards, perhaps, other software can be made to work. 

♦ 

VI. THE PORTABLE C COMPILER 

The original C compiler for the PDP-11 was not designed to be 
easy to adapt for other machines. Although successful compilers for 
the IBM System/370 and other machines were based on it, much of 
the modification effort in each case, particularly in the early stages, 
was concerned with ridding it of assumptions about the PDP-11. 
Even before the idea of moving UNIX occurred to us, it was clear 
that C was successful enough to warrant production of compilers for 
an increasing variety of machines. Therefore, one of the authors 
(SCJ) undertook to produce a new compiler intended from the start 
to be easily modified. This new compiler is now in use on the IBM 
System/370 under both OS and TSS, the Honeywell 6000, the Inter- 
data 8/32, the SEL86, the Data General Nova and Eclipse, the DEC 
VAX- 11/780, and a Bell System processor. Versions are in progress 
for the Intel 8086 microprocessor and other machines. 

The degree of portability achieved by this compiler is satisfying. 
In the Interdata 8/32 version, there are roughly 8,000 lines of 
source code. The first pass, which does syntax and lexical analysis 
and symbol table management, builds expression trees, and gen- 
erates a bit of machine-dependent code such as subroutine prologues 
and epilogues, consists of 4,600 lines of code, of which 600 are 
machine-dependent. In the second pass, which does the bulk of the 
code generation, 1,000 out of 3,400 lines are machine-dependent. 
Thus, out of a total of 8,000 lines, 1,600, or 20 percent, are 
machine-dependent; the remaining 80 percent are shared with the 
Honeywell, IBM, and other compilers. As the Interdata compiler 
becomes more carefully tuned, the machine-dependent figures will 
rise somewhat; for the IBM, the machine-dependent fraction is 22 
percent; for the Honeywell, 25 percent. 

These figures both overstate and understate the true difficulty of 
moving the compiler. They represent the size of those source files 
that contain machine-dependent code; only a half or a third of the 
lines in many machine-dependent functions actually differ from 
machine to machine, because most of the routines involved remain 
similar in structure. As an example, routines to output branches, 
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various types and their alignment requirements. Some issues, like 
addressability on the IBM 360 and 370 series, cause annoyance, but 
generally there are few problems in this area. 

The calling sequence is very important to the efficiency of the 
result and lakes considerable knowledge and imagination to design 
properly. However, once designed, the calling sequence code and 
the related issue of stack frame layout are easy to cope with in the 
compiler. 

Generating optimal code for arithmetic expressions, even on 
idealized machines, can be shown theoretically to be a nearly intract- 
able problem. For the machines we are given in real life, the prob- 
lem is even harder. Thus, all compilers have to compromise a bit 
with optimality and engage in heuristic algorithms to some extent, in 
order to get acceptably efficient code generated in a reasonable 
amount of time. 

The design of the code generator was influenced by a number of 
goals, which in turn were influenced by recent theoretical work in 
code generation. It was recognized that there was a premium in 
being able to get the compiler up and working quickly; it was also 
felt, however, that this was in many ways less important than being 
able to evolve and tune the compiler into a high-quality product as 
time went on. Particularly with operating system code, a “quick and 
dirty” implementation is simply unacceptable. It was also recog- 
nized that the compiler was likely to be applied to machines not well 
understood by the compiler writer that might have inadequate or 
nonexistent debugging facilities. Therefore, one goal of the com- 
piler was to permit it to be largely self-checking. Rather than pro- 
duce incorrect code, we felt it far preferable for the compiler to 
detect its own inadequacies and reject the input program. 

This goal was largely met. The compiler for the Interdata 8/32 
was working within a couple of weeks after the machine arrived; 
subsequently, several months went by with very little time lost due 
to compiler bugs. The bug level has remained low, even as the 
compiler has begun to be more carefully tuned; many of the bugs 
have resulted from human error (e.g., misreading the machine 
manual) rather than actual compiler failure. 

Several techniques contribute considerably to the general reliabil- 
ity of the compiler. First, a conscious attempt was made to separate 
information about the machine (e.g., facts such as “there is an add 
instruction that adds a constant to a register and sets the condition 
code”) from the strategy, often heuristic, that makes use of these 
facts (e.g., if an addition is to be done, first compute the left-hand 
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operand into a register). Thus, as the compiler evolves, more effort 
can be put into improving the heuristics and the recognition of 
important special cases, while the underlying knowledge about the 
machine operations need not be altered. This approach also 
improves portability, since the heuristic programs often remain 
largely unchanged among similar machines, while only the detailed 

changes^ 6 ^ f ° rmat ° f the instructi °ns (encoded in a table) 

During compilation of expressions, a model of the state of the 
compilation process, including the tree representing the expression 

iW 8 , h 0m .h Cd and | thC StatUS ° f the machine registers, is main- 
aincd by the compiler. As instructions are emitted, the expression 

tree is simplified. For example, the expression a = b + c might first 
be transformed into a = register* b as a load instruction for a is 
generated, then into a = register when an add is produced The 
possible transformations constitute the “facts” about the machine- 
the order in which they are applied correspond to the heuristics! 

en the input expression has been completely transformed into 
no h.ng the expression is compiled. Thus, a good portion of the 
initial design of a new version of the compiler is concerned with 
ma mg the mode within the compiler agree with the actual machine 
by building a table of machine operations and their effects on the 
model. When this is done correctly, one has a great deal of 

confidence that the compiler will produce correct code, if it produces 
any at all. 

Another useful technique is to partition the code generation job 
into pieces that interact only through well-defined paths One 
module worries about breaking up large expressions into manageable 
pieces, and allocating temporary storage locations when needed. 
Another module worries about register allocation. Finally, a third 
modu |e takes each “manageable” piece and the register allocation 
information, and generates the code. The division between these 
pieces is strict; if the third module discovers that an expression is 
"manageable, or a needed register is busy, it rejects the compila- 
on The division enforces a discipline on the compiler which 
while not really restricting its power, allows for fairly rapid debug- 
ging of the compiler output. 8 

The most serious drawback of the entire approach is the difficulty 
of proving any form of “completeness” property for the compiler- 
of demonstrating that the compiler will in fact successfully generate 
code for all legal C programs. Thus, for example, a needed 
transformation might simply be missing, so that there might be no 
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way to further simplify some expression. Alternatively, some 
sequence of transformations might result in a loop, so that the same 
expression keeps reappearing in a chain of transformations. The 
compiler detects these situations by realizing that too many passes 
are being made over the expression tree, and the input is rejected. 
Unfortunately, detection of these possibilities is difficult to do in 
advance because of the use of heuristics in the compiler algorithms. 
Currently, the best way of ensuring that the compiler is acceptably 
complete is by extensive testing. 

6.2 Testing the compiler 

We ordered the Interdata 8/32 without any software at all, so we 
first created a very crude environment that allowed stand-alone pro- 
grams to be run; all interrupts, memory mapping, etc., were turned 
off. The compiler, assembler, and loader ran on the PDP-11, and the 
resulting executable files were transferred to the Interdata for test- 
ing. Primitive routines permitted individual characters to be written 
on the console. In this environment, the basic stack management of 
the compiler was debugged, in some cases by single-stepping the 
machine. This was a painful but short period. 

After the function call mechanism was working, other short tests 
established the basic sanity of simple conditionals, assignments, and 
computations. At this point, the stand-alone environment could be 
enriched to permit input from the console and more informative 
output such as numbers and character strings, so ordinary C pro- 
grams could be run. We solicited such programs, but found few 
that did not depend on the file system or other operating system 
features. Some of the most useful programs at this stage were sim- 
ple games that pitted the computer against a human; they frequently 
did a large amount of computing, often with quite complicated logic, 
and yet restricted themselves to simple input and output. A number 
of compiler bugs were found and fixed by running games. After 
these tests, the compiler ceased to be an explicit object of testing, 
and became instead a tool by which we could move and test the 
operating system. 

Some of the most subtle problems with compiler testing come in 
the maintenance phase of the compiler, when it has been tested, 
declared to work, and installed. At this stage, there may be some 
interest in improving the code quality as well as fixing the occasional 
bug. An important tool here is regression testing; a collection of 
test programs are saved, together with the previous compiler output. 
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Before a new compiler is installed, the new compiler is fed these test 
programs, the new output is compared with the saved output, and 
differences are noted. If no differences are seen, and a compiler bug 
has been fixed or improvement made, the testing process is incom- 
plete, and one or more test programs are added. If differences are 
detected, they are carefully examined. The basic problem is that 
frequently, in attempting to fix a bug, the most obvious repair can 
give rise to other bugs, frequently breaking code that used to work. 
These other bugs can go undetected for some time, and are very 
painful both to the users and the compiler writer. Thus, regression 
tests attempt to guard against introducing new bugs while fixing old 
ones. 

The portable compiler is sufficiently self-checked that many poten- 
tial compiler bugs were detected before the compiler was installed by 
the simple expedient of turning the compiler loose on a large 
amount (tens of thousands of lines) of C source code. Many con- 
structions turned up there that were undreamed of by the compiler 
writer, and often mishandled by the compiler. 

It is worth mentioning that this kind of testing is easily carried out 
by means of the standard commands and features in the UNIX sys- 
tem. In particular, C source programs are easily identified by their 
names, and the UNIX shell provides features for applying command 
sequences automatically to each of a list of files in turn. Moreover, 
powerful utilities exist to compare two similar text files and produce 
a minimal list of differences. Finally, the compiler produces assem- 
bly code that is an ordinary text file readable by all of the usual utili- 
ties. Taken together, these features make it very simple to invent 
test drivers. For example, it takes only a half-dozen lines of input 
to request a list of differences between the outputs of two versions 
of the compiler applied to tens (or hundreds) of source files. 
Perhaps even more important, there is little or no output when the 
compilers compare exactly. On many systems, the “job control 
language required to do this would be so unpleasant as to insure 
that it would not be done. Even if it were, the resulting hundreds 
of pages of output could make it very difficult to see the places 
where the compiler needed attention. 

The design of the portable C compiler is discussed more 
thoroughly in Ref. 17. 

VII. LANGUAGE AND COMPILER ISSUES 

We were favorably impressed, even in the early stages, by the 
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general ease with which C programs could be moved to other 
machines. Some problems we did encounter were related to 
weaknesses in the C language itself, so we undertook to make a few 
extensions. 

C had no way of accounting in a machine-independent way for the 
overlaying of data. Most frequently, this need comes up in large 
tables that contain some parts having variable structure. As an 
invented example, a compiler’s table of constants appearing in a 
source program might have a (lag indicating the type of each con- 
stant followed by the constant’s value, which is either integer or 
floating. The C language as it existed allowed sufficient cheating to 
express the fact that the possible integer and floating value might be 
overlaid (both would not exist at once), but it could not be 
expressed portably because of the inability to express the relative 
sizes of integers and floating-point data in a machine-independent 
way. Therefore, the union declaration was added; it permits such a 
construction to be expressed in a natural and portable manner. 
Declaring a union of an integer and a floating point number reserves 
enough storage to hold either, and forces such alignment properties 
as may be required to make this storage useful as both an integer 
and a floating point number. This storage may be explicitly used as 
either integer or floating point by accessing it with the appropriate 
descriptor tag. 

Another addition was the typedef facility, which in effect allows 
the types of objects to be easily parameterized, typedef is used 
quite heavily in the operating system kernel, where the types of a 
number of different kinds of objects, for example, disk addresses, 
file offsets, device numbers, and times of day, are specified only 
once in a header file and assigned to a specific name; this name is 
then used throughout. Unlike some languages, C does not permit 
definition of new operations on these new types; the intent was 
increased parameterization rather than true extensibility. 

Although the C language did benefit from these extensions, the 
portability of the average C program is improved more by restricting 
the language than by extending it. Because it descended from type- 
less languages, C has traditionally been rather permissive in allowing 
dubious mixtures of various types; the most flagrant violations of 
good practice involved the confusion of pointers and integers. Some 
programs explicitly used character pointers to simulate unsigned 
integers; on the PDP-11 the two have the same arithmetic properties. 
Type unsigned was introduced into the language to eliminate the 
need for this subterfuge. 
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More often, type errors occurred unconsciously. For example, a 
function whose only use of an argument is to pass it to a subfunc- 
tion might allow the argument to be taken to be an integer by 
default. If the top-level actual argument is a pointer, the usage is 
harmless on many machines, but not type-correct and not, in gen- 
eral, portable. 

Violations of strict typing rules existed in many, perhaps most, of 
the programs making up the entire stock of UNIX system software. 
Yet these programs, representing many tens of thousands of lines of 
source code, all worked correctly on the PDP-11 and in fact would 
work on many other machines, because the assumptions they made 
were generally, though not universally, satisfied. It was not feasible 
simply to declare all the suspect constructions illegal. Instead, a 
separate program was written to detect as many dubious coding prac- 
tices as possible. This program, called lint, picks bits of fluff from 
programs in much the same way as the PFORT verifier mentioned 
above. C programs acceptable to lint are guaranteed to be free from 
most common type errors; lint also checks syntax and detects some 
logical errors, such as uninitialized variables, unused variables, and 
unreachable code. 

| There are definite advantages in separating program-checking 

from compilation. First, lint was easy to produce, because it is 
based on the portable compiler and thus shares the machine- 
independent code of the first pass with the other versions of the 
compiler. More important, the compilers, large programs anyway, 
are not burdened with a great deal of checking code which does not 
necessarily apply to the machine for which they are running. A 
good example of extra capability feasible in lint but probably not in 
the compilers themselves is checking for inter-program consistency. 
The C compilers all permit separate compilation of programs in 
several files, followed by linking together of the results, lint 
(uniquely) checks consistency of declarations of external variables, 

/ functions, and function arguments among a set of files and libraries.' 

\ Finally, lint itself is a portable program, identical on all machines. 
Although care was taken to make it easy to propagate changes in the 
machine-independent parts of the compilers with a minimum of 
fuss, it has proved useful for the sometimes complicated logic of lint 
to be totally decoupled from the compilers, lint cannot possibly 
affect their ability to produce code; if a bug in lint turns up, its out- 
put can be ignored and work can continue simply by ignoring the 
spurious complaints. This kind of separation of function is charac- 
teristic of UNIX programs in general. The compiler’s one important 
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job is to generate code; it is left to other programs to print listings, 
generate cross-reference tables, and enforce style rules. 

VIII. THE PORTABILITY OF THE UNIX KERNEL 

The UNIX operating system kernel, or brielly the operating system, 
is the permanently resident program that provides the basic software 
environment for all other programs running on the machine. It 
implements the “system calls” by which user’s programs interact 
with the file system and request other services, and arranges for 
several programs to share the machine without interference. The 
structure of the UNIX operating system kernel is discussed elsewhere 
in this issue. 18 ' 19 

To many people, an operating system may seem the very model 
of a nonportable program, but in fact a major portion of UNIX and 
other well-written operating systems consists of machine- 
independent algorithms: how to create, read, write, and delete files, 
how to decide who to run and who to swap, and so forth. If the 
operating system is viewed as a large C program, then it is reason- 
able to hope to apply the same techniques and tools to it that we 
apply to move more modest programs. 

The UNIX kernel can be roughly divided into three sections 
according to their degree of portability. 

8.1 Assembly-language primitives 

At the lowest level, and least portable, is a set of basic hardware 
interface routines. These are written in assembly language, and con- 
sist of about 800 lines of code on the Interdala 8/32. Some of them 
are callable directly from the rest of the system, and provide ser- 
vices such as enabling and disabling interrupts, invoking the basic 
I/O operations, changing the memory map so as to switch execution 
from one process to another, and transmitting information between 
a user process’s address space and that of the system. Most of them 
are machine-independent in specification, although not implementa- 
tion. Other assembly-language routines are not called explicitly but 
instead intercept interrupts, traps, and system calls and turn them 
into C-style calls on the routines in the rest of the operating system. 

Each time UNIX is moved to a new machine, the assembly- 
language portion of the system must be rewritten. Not only is the 
assembly code itself machine-specific, but the particular features 


provided for memory mapping, protection, and interrupt handling 
and masking differ greatly from machine to machine. In moving 
rom the PDP-11 to the Interdata 8/32, a huge preponderance of the 
bugs occurred in this section. One reason for this is certainly the 
usual sorts of difficulties found in assembly-language programming: 
we wrote loops that did not loop or looped forever, garbled critical 
constants, and wrote plausible-looking but utterly incorrect address 
constructions. Lack of familiarity with the machine led us to 
incmrect assumptions about how the hardware worked, and to 

inefficient use of available status information when things went 
wrong. 

finally, the most basic routines for multi-programming, those that 
pass control from one process to another, turned out (after causing 
months of nagging problems) to be incorrectly specified and actually 
unimplementable correctly on the Interdata, because they depended 
improperly on details of the register-saving mechanism of the calling 
sequence generated by the compiler. These primitives had to be 
redesigned; they are of special interest not only because of the prob- 
lems they caused, but because they represent the only part of the 
system that had to be significantly changed, as distinct from 

; expressed properly, to achieve portability. 

: 

f 

i 

8.2 Device drivers 

The second section of the kernel consists of device drivers, the 
programs that provide the interrupt handling, I/O command process- 
ing, and error recovery for the various peripheral devices connected 
to the machine. On the Interdata 8/32 the total size of drivers for 
the disk, magnetic tape, console typewriter, and remote typewriters 
is about 1100 lines of code, all in C. These programs are, of course 
machine-dependent, since the devices are. 

The drivers caused far fewer problems than did the assembly- 
anguage programs. Of course, they already had working models on 
the PDP-11, and we had faced the need to write new drivers several 
times in the past (there are half a dozen disk drivers for various 
kinds of hardware attached to the PDP-11). In adapting to the Inter- 
dala, the interface to the rest of the system survived unchanged 
and the drivers themselves shared their general structure, and even 
much code, with their PDP-11 counterparts. The problems that 
occurred seem more related to the general difficulty of dealing with 
the particular devices than in expressing what had to be done 

I 
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8.3 The remainder of the system 


The third and remaining section of the kernel is the largest. It is 
all written in C, and for the Interdata 8/32 contains about 7,000 
lines of code. This is the operating system proper, and clearly 
represents the bulk of the code. We hoped that it would be largely 
portable, and as it turned out our hopes were justified. A certain 
amount of work had to be done to achieve portability. Most of it 
was concerned with making sure that everything was declared prop- 
erly, so as to satisfy lint, and with replacing constants by parameters. 
For example, macros were written to perform various unit conver- 
sions previously written out explicitly: byte counts to memory seg- 
mentation units and to disk blocks, etc. The important data types 
used within the system were identified and specified using typedef: 
disk offsets, absolute times, internal device names, and the like. 
This effort was carried out by K. Thompson. 

Of the 7,000 lines in this portion of the operating system, only 
about 350 are different in the Interdata and PDP-11 versions; that is, 
they are 95 percent identical. Most of the differences are traceable 
to one of three areas. 

(/) On the PDP-11, the subroutine call stack grows towards 
smaller addresses, while on the Inlerdata it grows upwards. 
This leads to different code when increasing the size of a user 
stack, and especially when creating the argument list for an 
inter-program transfer (exec system call) because the argu- 
ments arc placed on the slack. 

(/'/) The details of the memory management hardware on the two 
machines are different, although they share the same general 
scheme. 

{Hi) The routine that handles processor traps (memory faults, etc.) 
and system calls is rather different in detail on the two 
machines because the set of faults is not identical, and 
because the method of argument transmission in system calls 
differs as well. 

We are extremely gratified by the ease with which this portion of 
the system was transferred. Only a few problems showed up in the 
code that was not changed; most were in the new code written 
specifically for the Interdata. In other words, what we thought 
would be portable did in fact move without trouble. 

Not everything went perfectly smoothly, of course. Our first set 
of major problems involved the mechanics of transferring test 
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systems and other programs from the PDP-11 to the Interdata 8/32 
an ebugging the result. Better communications between the 
machines would have helped considerably. For a period, installing a 

floor hImP SyStCm r ant Crea ‘ ing aP 800 BPI tape on the sixfh- 
floo. I DP-11 carrying the tape to another PDP-11 on the first floor to 

fifthflo I 3 , Th f r i0n ’ and fmally ,Ug8i "S the result to the 
fth-floor Interdata. For debugging, we would have been much 

aided by a hardware interface between the PDP-11 and the front 
lems is basl t0 alI ; w remote rebootin S- This class of prob- 

ea Te of no hi ^ that WC traded the momentary 

se of not having to write communications software or build 

on drugging C ° ntinuing ann °yance of carrying tapes and hands- 

Another class of problems seems impossible to avoid since it 
stems from the basic differences in the representation of’ informa- 
tion on the two machines. In the machines at issue, only one 
difference is important: the PDP-11 addresses the two bytes in a 16- 
bit word with the first byte as the least significant 8 bits, while on 
the Interdata the first byte in a 16-bit half-word is the most 
significant 8 bits Since all the interfaces between the two machines 
are by.e-senal, the effect is best described by saying that when a 
hue character stream is transmitted between them, all is well- but if 
integers are sent, the bytes in each half-word must be swapped 

which 6 ' H h u S Pr ° biem d0CS n0t ' nV0lve P° rtabilit y >n the sense in 
which it has been used throughout this paper; very few C programs 

are sensitive to the order in which bytes are stored on the machine 

on which they arc running. Instead it complicates “portability” in its 

roo meamng wherei" fi les are carried from one ' machine to the 

other. Thus, for example, during the initial creation of the Interdata 

system we were obliged to create, on the PDP-11, an image of a file 

Interdat dlS r 1°^ W0U ' d b<3 C ° pied l ° tape and thence to th e 

hlter nnchinc ^ '■ 38 3 " 3CtUal file System for the 

th! h ? required 3 certain amount of cleverness to declare 
the data structures appropriately and to decide which bytes to swap 

unusual^brnTh ° f ^ 8 W ° rd ° n the PdM1 is somewhat 

unusua!, but the problem ,t poses is quite representative of the 

difficulties of transferring encoded information from machine to 

machine. Another example is the difference in representation of 

floating-point numbers between the PDP-11 and the Interdata The 

assembler for the Interdata, when it runs on the PDP-11 must 

invoke a routine to convert the “natural” PDP-11 notation to the 

foreign notation, but of course this conversion must not be done 
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when the assembler is run on the Interdata itself. This makes the 
assembler necessarily non-portable, in the sense that it must execute 
different code sequences on the two machines. However, it can 
have a single source representation by taking advantage of condi- 
tional compilation depending on where it will run. 

This kind of problem can get much worse: how are we to move 
UNIX to a target machine with a 36-bit word length, whose machine 
word cannot even be represented by long integers on the PDP-11? 
Nevertheless, it is worth emphasizing that the problem is really 
vicious only during the initial bootstrapping phase; all the software 
should run properly if only it can be moved once! 

IX. TRANSPORTATION OF THE SOFTWARE 

Most UNIX code is in neither the operating system itself nor the 
compiler, but in the many user-level utilities implementing various 
commands and in subroutine libraries. The sheer bulk of the pro- 
grams involved (about 50,000 lines of source) meant that the 
amount of work in transportation might be considerable, but our 
early experience, together with the small average size of each indivi- 
dual program, convinced us that it would be manageable. This 
proved to be the case. 

Even before the advent of the Interdata machine, it was realized, 
as mentioned above, that many programs depended to an undesir- 
able degree not only on UNIX I/O conventions but on details of par- 
ticularly favorable buffering strategies for the PDP-11. A package of 
routines, called the “portable I/O library,” was written by M. E. 

Lesk 20 and implemented on the Honeywell and IBM machines as 
well as the PDP-11 in a generally successful effort to overcome the 
deficiencies of earlier packages. This library too proved to have 
some difficulties, not in portability, but in time efficiency and space 
required. Therefore a new package of routines, dubbed the “stan- 
dard I/O library,” was prepared. Similar in spirit to the portable 
library, it is somewhat smaller and much faster. Thus, part of the 
effort in moving programs to the Interdata machine was devoted to 
making programs use the new standard I/O library. In the simplest 
cases, the effort involved was nil, since the fundamental character 
I/O functions have the same names in all libraries. 

Next, each program had to be examined for visible lack of porta- 
bility. Of course, lint was a valuable tool here. Programs were also 
scrutinized by eye to detect dubious constructions. Often these 
involved constants. For example, on the 16-bit PDP-11 the 

I 
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x & 0177770 

masks off all but the last three bits of x, since 0177770 is an octal 
constant. This is almost certainly better expressed 

x & —07 

(where — is the ones-complement operator) because the latter 
expression actually does yield the last three bits of x independently 
of the word length of the machine. Better yet, the constant should 
be a parameter with a meaningful name. 

UNIX software has a number of conventional data structures, rang- 
ing from objects returned or accepted by the operating system kernel 
(such as status information for a named file) to the structure of the 
header of an executable file. Programs often had a private copy of 
the declaration for each such structure they used, and often the 
declaration was nonportable. For example, an encoded file mode 
might be declared int on the 16-bit PDP-11, but on the 32-bit Inter- 
data machine, it should be specified as short, which is unambigu- 
ously 16 bits. Therefore, another major task in making the software 
portable was to collect declarations of all structures common to 
several routines, to put the declarations in a standard place, and to 
use the include facility of the C preprocessor to insert them in the 
source program. The compiler for the PDP-11 and the cross- 
compiler for the Interdata 8/32 were adjusted to search a different 
standard directory to find the canned declarations appropriate to 
each. 

Finally, an effort was made to seek out frequently occurring 
patches of code and replace them by standard subroutines, or create 
new subroutines where appropriate. It turned out, for example, that 
several programs had built-in subroutines to find the printable user 
name corresponding to a numerical user id. Although in each case 
the subroutine as written was acceptably portable to other machines, 
the function it performed was not portable in time across changes in 
the format of the file, describing the name-number correspondence; 
encapsulating the translation function insulated the program against 
possible changes in a data base. 

X. THE MACHINE MODEL FOR C 

One of the hardest parts of designing a language in which to write 
portable programs is deciding which properties are guaranteed to 
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remain invariant. Likewise, in trying to develop a portable operating 
system, it is very hard to decide just what properties of the underly- 
ing machine can be depended on. The design questions in each case 
are many in number; moreover, the answer to each individual ques- 
tion may involve tradeoffs that are difficult to evaluate in advance. 
Here we try to show the nature of these tradeoffs and what sort of 
compromises are required. 

Designing a language in which every program is portable is actu- 
ally quite simple: specify precisely the meaning of every legal pro- 
gram, as well as what programs are legal. Then the portability prob- 
lem does not exist: by definition, if a correct program fails on some 
machine, the language has not been implemented properly. Unfor- 
tunately, a language like C that is intended to be used for system 
programming is not very adaptable to such a Procrustean approach, 
mainly because reasonable efficiency is required. Any well-defined 
language can be implemented precisely on any general-purpose com- 
puter, but the implementation may not be usable in practice if it 
implies use of an interpreter rather than machine instructions. 
Thus, with both language and operating system design, one must 
strike a balance between convenient and powerful features and the 
ease of implementing them efficiently on a variety of machines. At 
any point, some machine may be found on which some feature is 
very expensive to provide, and a decision must be made whether to 
modify the feature, and thus compromise the portability of programs 
that use it, or to insist that the meaning is immutable and must be 
preserved. In the latter case portability is also compromised since 
the cost of using the feature may be so high that no one can afford 
the programs that use it, or the people attempting to implement the 
feature on the new machine give up in despair. 

Thus a language definition implies a model of the machine on 
which programs in the language will run. If a real machine con- 
forms well to the model, then an implementation on that machine is 
likely to be efficient and easily written; if not, the implementation 
will be painful to provide and costly to use. Here we shall consider 
the major features of the abstract C machine that have turned out to 
be most relevant so far. 

10.1 Integers 

Probably the most frequent operations are on integers consisting 
of various numbers of bits. Variables declared short are at least 16 
bits in length; those declared long are at least 32 bits. Most are 
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declared int, and must be at least as precise as short integers, but 
may be long if accessing them as such is more efficient. It is 
interesting that the word length, which is one of the machine 
differences that springs first to mind, has caused rather little trouble. 
A small amount of code (mostly concerned with output conversion) 
assumes a twos complement representation. 


10.2 Unsigned integers 

; 

Unsigned integers corresponding to short and int must be pro- 
vided. The most relevant properties of unsigned integers appear 
when they are compared or serve as numerators in division and 
remaindering. Unsigned arithmetic may be somewhat expensive to 
; implement on some machines, particularly if the number representa- 

tion is sign-magnitude or ones complement. No use is made of 
< unsigned long integers. 


1 0.3 Characters 

• A representation of characters (bytes) must be provided with at 

least 8 bits per byte. It is irrelevant whether bytes are signed, as in 
the PDP-11, or not, as in all other known machines. It is moderately 
important that an integer of any kind be divisible evenly into bytes. 
Most programs make no explicit use of this fact, but the I/O system 
uses it heavily. (This tends to rule out one plausible representation 
of characters on the dec pdp-10, which is able to access five 7-bit 
characters in a 36-bit word with one bit left over. Fortunately, that 
machine can access four 9-bit characters equally well.) Almost all 
programs are independent of the order in which the bytes making up 
an integer are stored, but see the discussion above on this issue. 

A fair number of programs assume that the character set is ASCII. 
Usually the dependence is relatively minor, as when a character is 
tested for being a lower case letter by asking if it is between a and z 
(which is not a correct test in ebcdic). Here the test could be easily 
replaced by a call to a standard macro. Other programs that use 
characters to index a table would be much more difficult to render 
insensitive to the character set. ASCII is, after all, a U. S. national 
standard; we are inclined to make it a UNIX standard as well, while 
not ruling out C compilers for other systems based on other charac- 
ter sets (in fact the current IBM System/370 compiler uses EBCDIC). 

( 

{ 
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10.4 Pointers 


XI. THE MACHINE MODEL OF UNIX 


Pointers to objects of the various basic types are used very 
heavily. Frequent operations on pointers include assignment, com- 
parison, addition and subtraction of an integer, and dereferencing to 
yield the object to which the pointer points. It was frequently 
assumed in earlier UNIX code that pointers and integers had a similar 
representation (for example, that they occupied the same space). 
Now this assumption is no longer made in the programs that have 
been moved. Nevertheless, the representation of pointers remains 
very important, particularly in regard to character pointers, which 
are used freely. A word-addressed machine that lacks any natural 
representation of a character pointer may suffer serious inefficiency 
for some programs. 


10.5 Functions and the calling sequence 

UNIX programs tend to be built out of many small, frequently 
called functions. It is not unusual to find a program that spends 20 
percent of its time in the function prologue and epilogue sequence, 
nor one in which 20 percent of the code is concerned with preparing 
function argument lists. On the PDP-11/70 the calling sequence is 
relatively efficient (it costs about 20 microseconds to call and return 
from a function) so it is clear that a less efficient calling sequence 
will be quite expensive. Any function in C may be recursive 
(without special declaration) and most possess several “automatic” 
variables local to each invocation. These characteristics suggest 
strongly that a stack must be used to store the automatic variables, 
caller’s return point, and saved registers local to each function; in 
turn, the attractiveness of an implementation will depend heavily on 
the ease with which a stack can be maintained. Machines with too 
few index or base registers may not be able to support the language 
well. 

Efficiency is important in designing a calling sequence; moreover, 
decisions made here tend to have wide implications. For example, 
some machines have a preferred direction of growth for the stack. 

On the PDP-11, the stack is practically forced to grow towards 
smaller addresses; on the Interdata the stack prefers (somewhat 
more weakly) to grow upwards. Differences in the direction of stack 
growth leads to differences in the operating system, as has already 
been mentioned. 

\ 
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The definition of C suggests that some machines are more suitable 
for C implementations than others; likewise, the design of the UNIX 
kernel fits in well with some machine architectures and poorly with 
others. Once again, the requirements are not absolute, but a serious 
enough mismatch may make an implementation unattractive. 
Because the system is written in C, of course, a (perhaps neces- 
sarily) slow or bulky implementation of the language will lead to a 
slow or bulky operating system, so the remarks in the previous sec- 
tion apply. But other aspects of machine design are especially 
relevant to the operating system. 


1 1.1 Mapping and the user program 

As discussed in other papers, 18 ’ 21 the system provides user pro- 
grams with an address space consisting of up to three logical seg- 
ments containing the program text, an extensible data region, and a 
stack. Since the stack and the data are both allowed to grow at one 
edge, it is desirable (especially where the virtual address space is 
limited) that one grow in the negative direction, towards the other, 
so as to optimize the use of the address space. A few programs still 
assume that the data space grows in the positive direction (so that 
an array at its end can grow contiguously), although we have tried to 
minimize this usage. If the virtual address space is large, there is 
little loss in allowing both the data and stack areas to grow upwards. 

The PDP-11 and the Interdata provide examples of what can be 
done. On the former machine, the data area begins at the end of 
the program text and grows upwards, while the stack begins at the 
end of the virtual address space and grows downwards; this is, hap- 
pily, the natural direction of growth for the stack. On the Interdata 
the data space begins after the program and grows upwards; the 
stack begins at a fixed location and also grows upwards. The layout 
provides for a stack of at most 128K bytes and a data area of 852K 
bytes less the program size, as compared to the total data and stack 
space of 64K bytes possible on the PDP-11. 

It is hard to characterize precisely what is required of a memory 
mapping scheme except by discussing, as we do here, the uses to 
which it is put. In general, paging or segmentation schemes seem to 
offer sufficient generality to make implementation simple; a single 
base and limit register (or even dual registers, if it is desired to 
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write-protect the program text) are marginal, because of the 
difficulty of providing independently growable data and stack areas. 

1 1 .2 Mapping and the kernel 

When a process is running in the UNIX kernel, a fixed region of 
the kernel’s address space contains data specific to that process, 
including its kernel stack. Switching processes essentially involves 
changing the address map so that the same fixed range of virtual 
addresses refers to the data area and stack of the new process. This 
implies, of course, that the kernel runs in mapped mode, so that 
mapping should not be tied to operating in user mode. It also 
means that if the machine has but a single set of mapping 
specification registers, these registers will have to be reloaded on 
each system call and certain interrupts, for example from the clock. 

This causes no logical problems but may affect efficiency. 

1 1.3 Other considerations 

Many other aspects of machine design are relevant to implementa- 
tion of the operating system but are probably less important, 
because on most machines they are likely to cause no difficulty. 

Still, it is worthwhile to attempt a list. 

(/') The machine must have a clock capable of generating inter- 
rupts at a rate not far from 50 or 60 Hz. The interrupts are 
used to schedule internal events such as delays for mechanical 
motion on typewriters. As written, the system uses clock 
interrupts to maintain absolute time, so the interrupt rate 
should be accurate in the long run. However, changes to con- 
sult a separate time-of-day clock would be minimal. 

(//) All disk devices should be able to handle the same, relatively 
small, block sizes. The current system usually reads and 
writes 512-byte blocks. This number is easy to change, but if 
it is made much larger, the efficacy of the system’s cache 
scheme will degrade seriously unless a large amount of t 

memory is devoted to buffers. 

XII. WHAT HAS BEEN ACCOMPLISHED? 

In about six months, we have been able to move the UNIX operat- 
ing system and much of its software from its original host, the PDP- 
11, to another, rather different machine, the Interdata 8/32. The 
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standard of portability achieved is fairly high for such an ambitious 
project: the operating system (outside of device drivers and assem- 
bly language primitives) is about 95 percent unchanged between the 
two systems; inherently machine-dependent software such as the 
compiler, assembler, loader, and debugger are 75 to 80 percent 
unchanged; other user-level software (amounting to about 20,000 
lines so far) is identical, with few exceptions, on the two machines. 

It is true that moving a program from one machine to another 
does not guarantee that it can be moved to a third. There are many 
issues in portability about which we worried in a theoretical way 
without having to face them in fact. It would be interesting, for 
example, to tackle a machine in which pointers were a different size 
from integers, or in which character pointers were fundamentally 
different in structure from integer pointers, or with a different char- 
acter set. There are probably even issues in portability that we 
failed to consider at all. Nevertheless, moving UNIX to a third new 
machine, or a fourth, will be easier than it was to the second. The 
operating system and the software have been carefully parameter- 
ized, and this will not have to be done again. We have also learned 
a great deal about the critical issues (the “hard parts”). 

There are deeper limitations to the generality of what we have 
done. Consider the use of memory mapping: if the hardware cannot 
support the model assumed by the code as it is written, the code 
must be changed. This may not be difficult, but it does represent a 
loss of portability. Correspondingly, the system as written does not 
take advantage of extra capability beyond its model, so it does not 
support (for example) demand paging. Again, this would require 
new code. More generally, algorithms do not always scale well; the 
optimal methods of sorting files of ten, a thousand, and a million 
elements do not much resemble one another. Likewise, some of 
the design of the system as it exists may have to be reworked to 
take full advantage of machines much more powerful (along many 
possible dimensions) than those for which it was designed. This 
seems to be an inherent limit to portability; it can only be handled 
by making the system easy to change, rather than easily portable 
unchanged. Although we believe UNIX possesses both virtues, only 
the latter is the subject of this paper. 
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